Imbalanced Data Set CSVM Classification Method Based on Cluster Boundary Sampling
Similar resources
Imbalanced Data SVM Classification Method Based on Cluster Boundary Sampling and DT-KNN Pruning
This paper presents an SVM classification method based on cluster boundary sampling and sample pruning. We address the difficult problem of imbalanced data set classification through both data re-sampling and algorithm improvement. First, we propose the method of cluster boundary sampling, using the clustering density threshold and the boundary density ...
CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification
Class imbalance classification is a challenging research problem in data mining and machine learning, as most real-life datasets are imbalanced in nature. Existing learning algorithms maximise classification accuracy by correctly classifying the majority class, but misclassify the minority class. However, the minority class instances represent the concept with greater in...
Parallel selective sampling method for imbalanced and large data classification
Several applications aim to identify rare events in very large data sets. Classification algorithms may be severely limited on large data sets and show performance degradation due to class imbalance. Many solutions have been presented in the literature that deal with either the huge amount of data or the class imbalance, but separately. In this paper we assessed the performance of a novel method, P...
Cluster-based under-sampling approaches for imbalanced data distributions
For classification problems, the training data significantly influence classification accuracy. However, data in real-world applications often have an imbalanced class distribution; that is, most of the data are in the majority class and few are in the minority class. In this case, if all the data are used as training data, the classifier tends to predict that most of the incomin...
Cluster-Based Sampling Approaches to Imbalanced Data Distributions
For classification problems, the training data significantly influence classification accuracy. When the data set is highly unbalanced, classification algorithms tend to degenerate by assigning all cases to the most common outcome. Hence, it is important to select suitable training data for classification in the imbalanced class distribution problem. In this paper, we propose cluste...
Journal
Journal title: Mathematical Problems in Engineering
Year: 2016
ISSN: 1024-123X,1563-5147
DOI: 10.1155/2016/1540628